Deep Learning for Semantic Part Segmentation with High-Level Guidance

Author: Kokkinos I.
Papandreou G.
Tsogkas S.
Vedaldi A.
Publication date: 01/01/2015

In this work we address the task of segmenting an object into its parts, or semantic part segmentation. We start by adapting a state-of-the-art semantic segmentation system to this task, and show that a combination of a fully-convolutional Deep CNN system coupled with Dense CRF labelling provides excellent results for a broad range of object categories. Still, this approach remains agnostic to high-level constraints between object parts. We introduce such prior information by means of the Restricted Boltzmann Machine, adapted to our task, and train our model in a discriminative fashion, as a hidden CRF, demonstrating that prior information can yield additional improvements. We also investigate the performance of our approach "in the wild", without information concerning the objects' bounding boxes, using an object detector to guide a multi-scale segmentation scheme. We evaluate the performance of our approach on the Penn-Fudan and LFW datasets for the tasks of pedestrian parsing and face labelling respectively. We show superior performance with respect to competitive methods that have been extensively engineered on these benchmarks, as well as realistic qualitative results on part segmentation, even for occluded or deformable objects. We also provide quantitative and extensive qualitative results on three classes from the PASCAL Parts dataset. Finally, we show that our multi-scale segmentation scheme can boost accuracy, recovering segmentations for finer parts. Comment: 11 pages (including references), 3 figures, 2 table

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

Author: Albanie S
Hu J
Shen L
Sun G
Vedaldi A
Publication date: 01/01/2018

While the use of bottom-up local operators in convolutional neural networks (CNNs) matches well some of the statistics of natural images, it may also prevent such models from capturing contextual long-range feature interactions. In this work, we propose a simple, lightweight approach for better context exploitation in CNNs. We do so by introducing a pair of operators: gather, which efficiently aggregates feature responses from a large spatial extent, and excite, which redistributes the pooled information to local features. The operators are cheap, both in terms of number of added parameters and computational complexity, and can be integrated directly in existing architectures to improve their performance. Experiments on several datasets show that gather-excite can bring benefits comparable to increasing the depth of a CNN at a fraction of the cost. For example, we find ResNet-50 with gather-excite operators is able to outperform its 101-layer counterpart on ImageNet with no additional learnable parameters. We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics. Comment: NeurIPS 201

arXiv.org e-Print Archive

Oxford University Research Archive

CUED - Cambridge University Engineering Department

PASS: An ImageNet replacement for self-supervised pretraining without humans

Author: Asano Y.M.
Rupprecht C.
Vedaldi A.
Zisserman A.
Publication date: 01/01/2021

Computer vision has long relied on ImageNet and other large datasets of images sampled from the Internet for pretraining models. However, these datasets have ethical and technical shortcomings, such as containing personal information taken without consent, unclear license usage, biases, and, in some cases, even problematic image content. On the other hand, state-of-the-art pretraining is nowadays obtained with unsupervised methods, meaning that labelled datasets such as ImageNet may not be necessary, or perhaps not even optimal, for model pretraining. We thus propose an unlabelled dataset PASS: Pictures without humAns for Self-Supervision. PASS only contains images with CC-BY license and complete attribution metadata, addressing the copyright issue. Most importantly, it contains no images of people at all, and also avoids other types of images that are problematic for data protection or ethics. We show that PASS can be used for pretraining with methods such as MoCo-v2, SwAV and DINO. In the transfer learning setting, it yields similar downstream performances to ImageNet pretraining even on tasks that involve humans, such as human pose estimation. PASS does not make existing datasets obsolete, as for instance it is insufficient for benchmarking. However, it shows that model pretraining is often possible while using safer data, and it also provides the basis for a more robust evaluation of pretraining methods

Oxford University Research Archive

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Deep Filter Banks for Texture Recognition, Description, and Segmentation

Author: Cimpoi M
Kokkinos I
Maji S
Vedaldi A
Publication date: 01/05/2016

Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications. In this paper we make several contributions to texture understanding. First, instead of focusing on texture instance and material category recognition, we propose a human-interpretable vocabulary of texture attributes to describe common texture patterns, complemented by a new describable texture dataset for benchmarking. Second, we look at the problem of recognizing materials and texture attributes in realistic imaging conditions, including when textures appear in clutter, developing corresponding benchmarks on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic texture representations, including bag-of-visual-words and the Fisher vectors, in the context of deep learning and show that these have excellent efficiency and generalization properties if the convolutional layers of a deep model are used as filter banks. We obtain in this manner state-of-the-art performance in numerous datasets well beyond textures, an efficient method to apply deep features to image regions, as well as benefit in transferring features from one domain to another

UCL Discovery

Supervised Versus Unsupervised Deep Learning Based Methods for Skin Lesion Segmentation in Dermoscopy Images

Author: A Vedaldi
O Ronneberger
P Felzenszwalb
R Achanta
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Image segmentation is considered a crucial step in automatic dermoscopic image analysis as it affects the accuracy of subsequent steps. The huge progress in deep learning has recently revolutionized the image recognition and computer vision domains. In this paper, we compare a supervised deep learning based approach with an unsupervised deep learning based approach for the task of skin lesion segmentation in dermoscopy images. Results show that, by using the default parameter settings and network configurations proposed in the original approaches, although the unsupervised approach could detect fine structures of skin lesions on some occasions, the supervised approach shows much higher accuracy in terms of Dice coefficient and Jaccard index compared to the unsupervised approach, resulting in 77.7% vs. 40% and 67.2% vs. 30.4%, respectively. With a proposed modification to the unsupervised approach, the Dice and Jaccard values improved to 54.3% and 44%, respectively

Crossref

Stirling Online Research Repository (RIOXX)

Stirling Online Research Repository

Transferring Dense Pose to proximal animal classes

Author: Khalidov V.
McCarthy M.
Neverova N.
Sanakoyeu A.
Vedaldi A.
Publication date: 01/01/2020

Recent contributions have demonstrated that it is possible to recognize the pose of humans densely and accurately given a large dataset of poses annotated in detail. In principle, the same approach could be extended to any animal class, but the effort required for collecting new annotations for each case makes this strategy impractical, despite important applications in natural conservation, science and business. We show that, at least for proximal animal classes such as chimpanzees, it is possible to transfer the knowledge existing in dense pose recognition for humans, as well as in more general object detectors and segmenters, to the problem of dense pose recognition in other classes. We do this by (1) establishing a DensePose model for the new animal which is also geometrically aligned to humans, (2) introducing a multi-head R-CNN architecture that facilitates transfer of multiple recognition tasks between classes, (3) finding which combination of known classes can be transferred most effectively to the new animal and (4) using self-calibrated uncertainty heads to generate pseudo-labels graded by quality for training a model for this class. We also introduce two benchmark datasets labelled in the manner of DensePose for the class chimpanzee and use them to evaluate our approach, showing excellent transfer learning performance

MPG.PuRe

Deep Learning for Vanishing Point Detection Using an Inverse Gnomonic Projection

Author: A Almansa
A Criminisi
A Geiger
A Vedaldi
C Rother
F Pedregosa
J Košecká
O Barinova
P Beardsley
P Denis
R Hartley
RG Gioi von
ST Barnard
Y LeCun
Y Ueda
Publication date: 15/08/2017

We present a novel approach for vanishing point detection from uncalibrated monocular images. In contrast to the state of the art, we make no a priori assumptions about the observed scene. Our method is based on a convolutional neural network (CNN) which does not use natural images, but a Gaussian sphere representation arising from an inverse gnomonic projection of lines detected in an image. This allows us to rely on synthetic data for training, eliminating the need for labelled images. Our method achieves competitive performance on three horizon estimation benchmark datasets. We further highlight some additional use cases for which our vanishing point detection algorithm can be used. Comment: Accepted for publication at German Conference on Pattern Recognition (GCPR) 2017. This research was supported by German Research Foundation DFG within Priority Research Programme 1894 "Volunteered Geographic Information: Interpretation, Visualisation and Social Computing

arXiv.org e-Print Archive

Crossref

University of Twente Research Information

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers

Author: Asano Y.
Campbell D.
Feichtenhofer C.
Henriques J.
Metze F.
Misra I.
Patrick M.
Vedaldi A.
Publication venue: Neural Information Processing Systems Foundation
Publication date: 01/01/2022

International Migration, Integration and Social Cohesion online publications

The Phototoxicity of Fluvastatin, an HMG-CoA Reductase Inhibitor, Is Mediated by the formation of a Benzocarbazole-Like Photoproduct

Author: Alessia Salvador
Daniela Vedaldi
Francesco Dall'Acqua
Giampietro Viola
Giuseppe Basso
Jadwiga Mielcarek
Maria A. Linardi
Pawel Grobelny
Stefano Dall'Acqua
Publication date: 28/07/2010

In this paper, we have investigated the mechanism of phototoxicity of fluvastatin, a 3-hydroxy-3-methylglutaryl coenzyme A reductase inhibitor, in the human keratinocyte cell line NCTC-2544. Fluvastatin underwent rapid photodegradation upon Ultraviolet-A (UVA) irradiation in buffered aqueous solution as shown by the changes in absorption spectra. Interestingly, no isosbestic points were observed but only a fast appearance of a spectral change, indicative of the formation of a new chromophore. The isolation and characterization of the main photoproduct revealed the formation of a polycyclic compound with a benzocarbazole-like structure. This product was also evaluated for its phototoxic potential. Cell phototoxicity was evaluated by 3-(4,5-dimethylthiazol-2-yl)-2,5 diphenyl tetrazolium bromide test after 72 h from the irradiation in the presence of fluvastatin. The results showed a reduction of the cell viability in a concentration and UVA dose-dependent manner. Surprisingly, the photoproduct showed a dramatic decrease of the cell viability that occurred at concentrations of an order of magnitude lower than the parent compound. Flow cytometric analysis indicated that fluvastatin and its main photoproduct induced principally necrosis as revealed by the large appearance of propidium iodide-positive cells and confirmed also by the rapid drop in cellular adenosine triphosphate levels. Interestingly, a rapid increase of intracellular calcium followed by an extensive cell lipid membrane peroxidation and a significant oxidation of model proteins were induced by fluvastatin and its photoproduct, suggesting that these compounds exerted their toxic effect mainly in the cellular membranes. On the basis of our results, the phototoxicity of fluvastatin may be mediated by the formation of a benzocarbazole-like photoproduct that acts as a strong photosensitizer

Open Access Repository